Return reasoning content from model output#112
Conversation
Protocol conversion should use the returned reasoning payload as the source of truth. MODEL_PARAMETER_RULES can still control model-side thinking, but OpenAI and AG-UI responses should not hide non-empty reasoning_content when the env flag says thinking is false. Constraint: User requested removal of protocol-level thinking_enabled = is_thinking_enabled_from_env() gating. Rejected: Keep env-based response suppression | runtime parameters can drift from the model output and hide returned reasoning. Confidence: high Scope-risk: narrow Directive: Do not reintroduce protocol-level reasoning env gates; only emit reasoning when returned reasoning_content is non-empty. Tested: uv run --python 3.11 --dev --extra server pytest -q tests/unittests/server/test_openai_protocol.py tests/unittests/server/test_agui_protocol.py tests/unittests/server/test_reasoning.py Tested: uv run --python 3.11 --dev --extra server pytest -q tests/unittests/server Tested: uv run --python 3.11 --dev --extra server ruff check agentrun/server/openai_protocol.py agentrun/server/agui_protocol.py tests/unittests/server/test_openai_protocol.py tests/unittests/server/test_agui_protocol.py Tested: git diff --check Change-Id: I638efa7ca19bf8ed9417fb1922d43205d4d52b65 Not-tested: Remote GitHub CI result pending after push.
There was a problem hiding this comment.
Pull request overview
This PR adjusts the OpenAI Chat Completions and AG-UI protocol handlers to surface reasoning_content based on actual model output (non-empty) rather than gating at the protocol layer via MODEL_PARAMETER_RULES. It also strips transport-only additional_kwargs.reasoning_content from emitted protocol payloads while still promoting it into the standardized reasoning field/event when present.
Changes:
- Removed protocol-layer
MODEL_PARAMETER_RULESgating for reasoning emission in OpenAI and AG-UI protocol conversions. - Promoted
reasoning_contentfromadditional_kwargs(and stripped the nested transport field), emitting reasoning only when non-empty. - Updated unit tests to assert reasoning is included even when
MODEL_PARAMETER_RULESdisables thinking.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
agentrun/server/openai_protocol.py |
Removes protocol gating; promotes and cleans reasoning fields in OpenAI streaming/non-stream responses. |
agentrun/server/agui_protocol.py |
Removes protocol gating; emits AG-UI reasoning events when non-empty and strips transport-only reasoning fields from additions. |
tests/unittests/server/test_openai_protocol.py |
Updates/extends unit coverage for reasoning emission and promotion when thinking is disabled. |
tests/unittests/server/test_agui_protocol.py |
Updates unit coverage for reasoning events and ordering when thinking is disabled. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if event.event == EventType.REASONING: | ||
| if thinking_enabled: | ||
| reasoning_content = event.data.get("delta", "") | ||
| if reasoning_content: | ||
| has_text = True | ||
| yield self._build_chunk( | ||
| context, | ||
| {"reasoning_content": reasoning_content}, | ||
| ) | ||
| reasoning_content = event.data.get("delta", "") | ||
| if reasoning_content: | ||
| has_text = True | ||
| yield self._build_chunk( |
| if event.event == EventType.TEXT: | ||
| content_parts.append(event.data.get("delta", "")) | ||
| reasoning_content = get_reasoning_content(event.addition or {}) | ||
| if thinking_enabled and reasoning_content: | ||
| if reasoning_content: | ||
| reasoning_parts.append(reasoning_content) |
The protocol handlers now surface reasoning_content when the model output contains it, independent of MODEL_PARAMETER_RULES. Keep the e2e contract aligned with that behavior so the false thinking flag case verifies passthrough instead of suppression. Constraint: Requirement moved env thinking control to model invocation, while protocol conversion follows actual returned reasoning_content. Rejected: Restore protocol-level env gating | It would contradict the current reasoning_content passthrough requirement. Confidence: high Scope-risk: narrow Directive: Do not reintroduce protocol-level MODEL_PARAMETER_RULES gating for reasoning_content without a new product decision. Tested: uv sync --python 3.10 --dev --all-extras; uv run --python 3.10 pytest tests/e2e/test_reasoning_protocol.py -q; git diff --check Change-Id: I806b351980a716b4e0395414b0896ddd603747bc Not-tested: Full cloud-backed e2e suite locally; waiting for GitHub Actions E2E.
The previous E2E run passed all reasoning protocol checks but failed one cloud sandbox call with RemoteDisconnected from the service side. Trigger a fresh run without changing behavior. Constraint: GitHub integration cannot rerun failed jobs: Resource not accessible by integration. Rejected: Change sandbox implementation or test expectation | The failure is outside the reasoning_content change and occurred after the same cloud test had already passed its async variant. Confidence: medium Scope-risk: narrow Directive: Remove this empty retry commit before merge if maintainers prefer a cleaner history, or squash it with the e2e expectation commit. Tested: Prior run 26796842299 showed tests/e2e/test_reasoning_protocol.py all passed before one sandbox service RemoteDisconnected failure. Change-Id: Idd9a2618187c61b6bbfb360c6f8de2cfb719974a Not-tested: Fresh GitHub E2E result pending.
OhYee
left a comment
There was a problem hiding this comment.
Review: PR #112 — Return reasoning content from model output
结论:LGTM ✅
整体思路正确:MODEL_PARAMETER_RULES 应该控制发给模型的参数(是否启用 thinking),而不应该在协议层过滤模型已经返回的 reasoning 输出。如果模型返回了 reasoning content,SDK 应该如实透传。
改动要点
- 关注点分离清晰 —
is_thinking_enabled_from_env()从 OpenAI / AG-UI 协议层完全移除,不再做输出侧 gating - 重命名恰当 —
_apply_reasoning_gate→_promote_reasoning_content,语义更准确:从additional_kwargs提取 → 清理传输字段 → 提升到顶层 - AG-UI
_strip_reasoning_from_addition— 移除thinking_enabled参数后,空 dict 统一返回None,更干净 - 测试全面 — 单元测试和 E2E 测试都已更新,覆盖 thinking disabled 场景下仍然输出 reasoning 的新行为
关于 Copilot 的两个 comment
- sent_role 问题:这是 pre-existing behavior,本 PR 未改变事件顺序,不是 regression
- E2E 测试未更新:已经在
test_reasoning_protocol.py中更新了,Copilot 判断有误
🤖 Reviewed by Cortex + Claude Code
Summary
MODEL_PARAMETER_RULESgating from OpenAI Chat Completions and AG-UI reasoning output conversion.reasoning_contentpayload is non-empty.additional_kwargs.reasoning_content, while stripping the nested transport-only field from protocol payloads.Validation
uv run --python 3.11 --dev --extra server pytest -q tests/unittests/server/test_openai_protocol.py tests/unittests/server/test_agui_protocol.py tests/unittests/server/test_reasoning.pyuv run --python 3.11 --dev --extra server pytest -q tests/unittests/serveruv run --python 3.11 --dev --extra server ruff check agentrun/server/openai_protocol.py agentrun/server/agui_protocol.py tests/unittests/server/test_openai_protocol.py tests/unittests/server/test_agui_protocol.pygit diff --checkNotes
MODEL_PARAMETER_RULEScan still be used by the runtime/model-call layer to control whether the model thinks. The SDK response protocol layer now trusts the actual model output: if reasoning is returned, it is surfaced; if reasoning is absent or empty, no reasoning field/event is emitted.